Search for: All records

Creators/Authors contains: "Zhang, Xinyue"


  1. In closed-domain Question Answering (QA), Large Language Models (LLMs) often fail to deliver responses specialized enough for niche subdomains. Broadly trained models may not capture the nuanced terminology and contextual precision required in these fields, which frequently lack domain-specific conversational data and face computational constraints. To address this, we propose a methodology leveraging a Retrieval-Augmented Generation (RAG) framework that integrates data extraction with fine-tuning using domain-specific question-answer pairs. Our approach employs Question-Answer Generation (QAG) to create tailored training datasets, enabling fine-tuned models to incorporate specialized jargon and context while remaining computationally accessible to domain experts. To exemplify this methodology, we demonstrate its application within the medical domain through a case study centered on the creation of a dementia care chat assistant. A significant benefit of this approach lies in its ease of replication across various domains and scalability for integration into diverse user groups, making it a versatile solution for enhancing chat assistants. 
    Free, publicly-accessible full text available June 24, 2026
  2. Free, publicly-accessible full text available May 8, 2026
  3. Free, publicly-accessible full text available March 17, 2026
  4. The Event Horizon Telescope (EHT) has produced resolved images of the supermassive black holes (SMBHs) Sgr A* and M87*, which present the largest shadows on the sky. In the next decade, technological improvements and extensions to the array will enable access to a greater number of sources, unlocking studies of a larger population of SMBHs through direct imaging. In this paper, we identify 12 of the most promising sources beyond Sgr A* and M87* based on their angular size and millimeter flux density. For each of these sources, we make theoretical predictions for their observable properties by ray tracing general relativistic magnetohydrodynamic models appropriately scaled to each target’s mass, distance, and flux density. We predict that these sources would have somewhat higher Eddington ratios than M87*, which may result in larger optical and Faraday depths than previous EHT targets. Despite this, we find that visibility amplitude size constraints can plausibly recover masses within a factor of 2, although the unknown jet contribution remains a significant uncertainty. We find that the linearly polarized structure evolves substantially with the Eddington ratio, with greater evolution at larger inclinations, complicating potential spin inferences for inclined sources. We discuss the importance of 345 GHz observations, millijansky baseline sensitivity, and independent inclination constraints for future observations with upgrades to the EHT through ground updates with the next-generation EHT program and extensions to space through the Black Hole Explorer.
    Free, publicly-accessible full text available May 13, 2026
  5. Deep learning (DL) has attracted interest in healthcare for disease diagnosis systems in medical imaging analysis (MedIA) and is especially applicable in Big Data environments like federated learning (FL) and edge computing. However, there is little research into mitigating the vulnerabilities and robustness of such systems against adversarial attacks, which can force DL models to misclassify, leading to concerns about diagnosis accuracy. This paper aims to evaluate the robustness and scalability of DL models for MedIA applications against adversarial attacks while ensuring their applicability in FL settings with Big Data. We fine-tune three state-of-the-art transfer learning models, DenseNet121, MobileNet-V2, and ResNet50, on several MedIA datasets of varying sizes and show that they are effective at disease diagnosis. We then apply the Fast Gradient Sign Method (FGSM) to attack the models and utilize adversarial training (AT) and knowledge distillation to defend them. We provide a performance comparison of the original transfer learning models and the defended models on the clean and perturbed data. The experimental results show that the defensive techniques can improve the robustness of the models to the FGSM attack and be scaled for Big Data as well as utilized for edge computing environments. 
    Free, publicly-accessible full text available December 15, 2025
  6. Machine learning has been successfully applied to big data analytics across various disciplines. However, as data is collected from diverse sectors, much of it is private and confidential. At the same time, one of the major challenges in machine learning is the slow training speed of large models, which often requires high-performance servers or cloud services. To protect data privacy while still allowing model training on such servers, privacy-preserving machine learning using Fully Homomorphic Encryption (FHE) has gained significant attention. However, its widespread adoption is hindered by performance degradation. This paper presents our experiments on training models over encrypted data using FHE. The results show that while FHE ensures privacy, it can significantly degrade performance, requiring complex tuning to optimize. 
    Free, publicly-accessible full text available December 15, 2025
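
Illustrative note on record 5: the abstract names the Fast Gradient Sign Method (FGSM) as the attack applied to the fine-tuned models. The sketch below shows the general FGSM update, x_adv = x + epsilon * sign(gradient of the loss with respect to x), on a tiny logistic-regression classifier. It is not the paper's code: the model, weights, input, and epsilon are hypothetical values chosen for illustration, and the paper's models are deep networks (DenseNet121, MobileNet-V2, ResNet50) rather than a linear classifier.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(weights, bias, x):
    """Probability of class 1 under a logistic-regression model."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


def loss_gradient_wrt_input(weights, bias, x, y):
    """Gradient of the binary cross-entropy loss with respect to the input x.

    For logistic regression this has the closed form (p - y) * w.
    """
    p = predict_proba(weights, bias, x)
    return [(p - y) * w for w in weights]


def sign(v):
    """Return -1, 0, or 1 according to the sign of v."""
    return (v > 0) - (v < 0)


def fgsm(weights, bias, x, y, epsilon):
    """One FGSM step: move each input feature by epsilon in the direction
    that increases the loss for the true label y."""
    grad = loss_gradient_wrt_input(weights, bias, x, y)
    return [xi + epsilon * sign(g) for xi, g in zip(x, grad)]


# Hypothetical toy model and input (not taken from the paper).
weights = [2.0, -1.0]
bias = 0.0
x = [0.5, 0.2]   # clean input, true label 1
y = 1

x_adv = fgsm(weights, bias, x, y, epsilon=0.5)

clean_p = predict_proba(weights, bias, x)      # above 0.5: classified as 1
adv_p = predict_proba(weights, bias, x_adv)    # below 0.5: now classified as 0
```

In this toy case the perturbed input flips the predicted class. Adversarial training, one of the defenses the abstract mentions, would add such perturbed inputs (with their true labels) to the training data.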